Segmentation and classification of mixed text/graphics/image documents
Internal identifier: 002C57 (Main/Exploration); previous: 002C56; next: 002C58
Authors: Kuo-Chin Fan [People's Republic of China]; Chi-Hwa Liu [People's Republic of China]; Yuan-Kai Wang [People's Republic of China]
Source:
- Pattern Recognition Letters [0167-8655]; 1994.
Abstract
In this paper, a feature-based document analysis system is presented which utilizes domain knowledge to segment and classify mixed text/graphics/image documents. In our approach, we first perform a run-length smearing operation followed by the stripe merging procedure to segment the blocks embedded in a document. The classification task is then performed based on the domain knowledge induced from the primitives associated with each type of medium. Proper use of domain knowledge is proved to be effective in accelerating the segmentation speed and decreasing the classification error. The experimental study reveals the feasibility of the new technique in segmenting and classifying mixed text/graphics/image documents.
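The run-length smearing step named in the abstract can be illustrated with a minimal sketch. This is the classic horizontal smearing rule (short white runs between black pixels are filled), not necessarily the authors' exact variant; the function names `smear_row` and `smear_image`, the `threshold` parameter, and the convention that border runs are left untouched are assumptions made for illustration. The stripe merging procedure and the classification rules are not described in enough detail here to sketch.

```python
def smear_row(row, threshold):
    """Horizontally smear one binary row (1 = black, 0 = white).

    Assumption: a white run is filled only if it lies between two black
    pixels and its length is at most `threshold`; runs that touch the
    row border are left unchanged.
    """
    out = list(row)
    last_black = None  # index of the most recent black pixel seen so far
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_black is not None:
                gap = i - last_black - 1  # length of the white run in between
                if 0 < gap <= threshold:
                    for j in range(last_black + 1, i):
                        out[j] = 1
            last_black = i
    return out


def smear_image(image, threshold):
    """Apply horizontal smearing to every row of a binary image (list of rows)."""
    return [smear_row(row, threshold) for row in image]
```

With `threshold=2`, the row `[1, 0, 0, 1, 0, 0, 0, 0, 1, 0]` becomes `[1, 1, 1, 1, 0, 0, 0, 0, 1, 0]`: the two-pixel gap is closed, while the four-pixel gap and the trailing white pixel are kept. Nearby characters thus fuse into solid blocks that a later step can segment.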
Url: https://api.istex.fr/document/123A7198BEC7D9CD578696A38B6DD150816C6241/fulltext/pdf
DOI: 10.1016/0167-8655(94)90110-4
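Affiliations: Institute of Computer Science and Electronic Engineering, National Central University, Chung-Li, Taiwan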
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000918
- to stream Istex, to step Curation: 000908
- to stream Istex, to step Checkpoint: 001F52
- to stream Main, to step Merge: 002E24
- to stream Main, to step Curation: 002C57
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title>Segmentation and classification of mixed text/graphics/image documents</title>
<author><name sortKey="Fan, Kuo Chin" sort="Fan, Kuo Chin" uniqKey="Fan K" first="Kuo-Chin" last="Fan">Kuo-Chin Fan</name>
</author>
<author><name sortKey="Liu, Chi Hwa" sort="Liu, Chi Hwa" uniqKey="Liu C" first="Chi-Hwa" last="Liu">Chi-Hwa Liu</name>
</author>
<author><name sortKey="Wang, Yuan Kai" sort="Wang, Yuan Kai" uniqKey="Wang Y" first="Yuan-Kai" last="Wang">Yuan-Kai Wang</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:123A7198BEC7D9CD578696A38B6DD150816C6241</idno>
<date when="1994" year="1994">1994</date>
<idno type="doi">10.1016/0167-8655(94)90110-4</idno>
<idno type="url">https://api.istex.fr/document/123A7198BEC7D9CD578696A38B6DD150816C6241/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000918</idno>
<idno type="wicri:Area/Istex/Curation">000908</idno>
<idno type="wicri:Area/Istex/Checkpoint">001F52</idno>
<idno type="wicri:doubleKey">0167-8655:1994:Fan K:segmentation:and:classification</idno>
<idno type="wicri:Area/Main/Merge">002E24</idno>
<idno type="wicri:Area/Main/Curation">002C57</idno>
<idno type="wicri:Area/Main/Exploration">002C57</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a">Segmentation and classification of mixed text/graphics/image documents</title>
<author><name sortKey="Fan, Kuo Chin" sort="Fan, Kuo Chin" uniqKey="Fan K" first="Kuo-Chin" last="Fan">Kuo-Chin Fan</name>
<affiliation wicri:level="1"><country xml:lang="fr" wicri:curation="lc">République populaire de Chine</country>
<wicri:regionArea>Institute of Computer Science and Electronic Engineering, National Central University, Chung-Li, Taiwan</wicri:regionArea>
<wicri:noRegion>Taiwan</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Liu, Chi Hwa" sort="Liu, Chi Hwa" uniqKey="Liu C" first="Chi-Hwa" last="Liu">Chi-Hwa Liu</name>
<affiliation wicri:level="1"><country xml:lang="fr" wicri:curation="lc">République populaire de Chine</country>
<wicri:regionArea>Institute of Computer Science and Electronic Engineering, National Central University, Chung-Li, Taiwan</wicri:regionArea>
<wicri:noRegion>Taiwan</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Wang, Yuan Kai" sort="Wang, Yuan Kai" uniqKey="Wang Y" first="Yuan-Kai" last="Wang">Yuan-Kai Wang</name>
<affiliation wicri:level="1"><country xml:lang="fr" wicri:curation="lc">République populaire de Chine</country>
<wicri:regionArea>Institute of Computer Science and Electronic Engineering, National Central University, Chung-Li, Taiwan</wicri:regionArea>
<wicri:noRegion>Taiwan</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="j">Pattern Recognition Letters</title>
<title level="j" type="abbrev">PATREC</title>
<idno type="ISSN">0167-8655</idno>
<imprint><publisher>ELSEVIER</publisher>
<date type="published" when="1994">1994</date>
<biblScope unit="volume">15</biblScope>
<biblScope unit="issue">12</biblScope>
<biblScope unit="page" from="1201">1201</biblScope>
<biblScope unit="page" to="1209">1209</biblScope>
</imprint>
<idno type="ISSN">0167-8655</idno>
</series>
<idno type="istex">123A7198BEC7D9CD578696A38B6DD150816C6241</idno>
<idno type="DOI">10.1016/0167-8655(94)90110-4</idno>
<idno type="PII">0167-8655(94)90110-4</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0167-8655</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">In this paper, a feature-based document analysis system is presented which utilizes domain knowledge to segment and classify mixed text/graphics/image documents. In our approach, we first perform a run-length smearing operation followed by the stripe merging procedure to segment the blocks embedded in a document. The classification task is then performed based on the domain knowledge induced from the primitives associated with each type of medium. Proper use of domain knowledge is proved to be effective in accelerating the segmentation speed and decreasing the classification error. The experimental study reveals the feasibility of the new technique in segmenting and classifying mixed text/graphics/image documents.</div>
</front>
</TEI>
<affiliations><list><country><li>République populaire de Chine</li>
</country>
</list>
<tree><country name="République populaire de Chine"><noRegion><name sortKey="Fan, Kuo Chin" sort="Fan, Kuo Chin" uniqKey="Fan K" first="Kuo-Chin" last="Fan">Kuo-Chin Fan</name>
</noRegion>
<name sortKey="Liu, Chi Hwa" sort="Liu, Chi Hwa" uniqKey="Liu C" first="Chi-Hwa" last="Liu">Chi-Hwa Liu</name>
<name sortKey="Wang, Yuan Kai" sort="Wang, Yuan Kai" uniqKey="Wang Y" first="Yuan-Kai" last="Wang">Yuan-Kai Wang</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002C57 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002C57 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:123A7198BEC7D9CD578696A38B6DD150816C6241 |texte= Segmentation and classification of mixed text/graphics/image documents }}
This area was generated with Dilib version V0.6.32.